6 research outputs found

    Performance Optimization and Dynamics Control for Large-scale Data Transfer in Wide-area Networks

    Transport control plays an important role in the performance of large-scale scientific and media streaming applications involving transfer of large data sets, media streaming, online computational steering, interactive visualization, and remote instrument control. In general, these applications have two distinctive classes of transport requirements: large-scale scientific applications require high bandwidths to move bulk data across wide-area networks, while media streaming applications require stable bandwidths to ensure smooth media playback. Unfortunately, the widely deployed Transmission Control Protocol is inadequate for such tasks due to its performance limitations. The purpose of this dissertation is to conduct a rigorous analytical study of the design and performance of transport solutions, and to systematically develop an integrated transport solution that overcomes the limitations of current transport methods. One of the primary challenges is to explore and compose a set of feasible route options with multiple constraints. Another challenge arises from the randomness inherent in wide-area networks, particularly the Internet. This randomness must be explicitly accounted for to achieve both goodput maximization and stabilization over the constructed routes by suitably adjusting the source rate in response to both network and host dynamics. The superior and robust performance of the proposed transport solution is extensively evaluated in a simulated environment and further verified, in comparison with existing transport methods, through real-life implementations and deployments over both the Internet and dedicated connections under disparate network conditions.

    Network-Aware Data Movement Advisor

    Next-generation eScience applications often generate large amounts of simulation or experimental data that must be shared and managed by collaborative organizations. Advanced networking technologies and services have been rapidly developed and deployed to facilitate the massive data transport necessary for such data sharing and collaboration. However, these technologies and services have not been fully utilized by application users, mainly because their use typically requires significant domain knowledge and, in many cases, the public is not even aware that they exist. We design and develop a Network-aware Data Movement Advisor (NADMA) utility to enable automated discovery of network and system resources and advise the user of efficient strategies for fast and successful data transfer. NADMA is primarily a client-end program that interacts with existing data/space management and discovery services such as Storage Resource Management, transport methods such as GridFTP, and network resource provisioning systems such as TeraPaths and OSCARS. NADMA acts like the route planner in a typical vehicle navigation system, providing the user with a set of feasible route options along with performance estimations as well as specific steps and commands to authorize and execute data transfer. We demonstrate the efficacy of NADMA in several use cases based on its implementation and deployment in wide-area networks.

    On Optimization of Scientific Workflows to Support Streaming Applications in Distributed Network Environments

    Large-scale data-intensive streaming applications in various science fields feature complex DAG-structured workflows comprised of distributed computing modules with intricate inter-module dependencies. Supporting such workflows in high-performance network environments and optimizing their throughput are crucial to collaborative scientific exploration and discovery. We formulate workflow mapping as a frame rate optimization problem and propose an efficient heuristic solution, which is integrated into the Condor-based Scientific Workflow Automation and Management Platform (SWAMP) in place of Condor's default mapping scheme. The SWAMP system is also augmented with several new components to improve the workflow management process. The performance superiority of the proposed solution is verified using both simulations and a real-life scientific workflow for climate modeling deployed in a distributed heterogeneous network environment.

    A Distributed Workflow Management System with Case Study of Real-Life Scientific Applications on Grids

    Next-generation scientific applications feature complex workflows comprised of many computing modules with intricate inter-module dependencies. Supporting such scientific workflows in wide-area networks, especially Grids, and optimizing their performance are crucial to the success of collaborative scientific discovery. We develop a Scientific Workflow Automation and Management Platform (SWAMP), which enables scientists to conveniently assemble, execute, monitor, control, and steer computing workflows in distributed environments via a unified web-based user interface. The SWAMP architecture is built entirely on a seamless composition of web services: both its own functionalities and its interactions with other tools or systems are provided through web services, which allows easy access over standard Internet protocols and keeps the system independent of platforms and programming languages. SWAMP also incorporates a class of efficient workflow mapping schemes to achieve optimal end-to-end performance based on rigorous performance modeling and algorithm design. The performance superiority of SWAMP over existing workflow mapping schemes is justified by extensive simulations, and the system efficacy is illustrated by large-scale experiments on real-life scientific workflows for climate modeling through effective system implementation, deployment, and testing on the Open Science Grid.

    Automation and Management of Scientific Workflows in Distributed Network Environments

    Large-scale computation-intensive applications in various science fields feature complex DAG-structured workflows comprised of distributed computing modules with intricate inter-module dependencies. Supporting such workflows in heterogeneous network environments and optimizing their end-to-end performance are crucial to the success of large-scale collaborative scientific applications. We design and develop a generic Scientific Workflow Automation and Management Platform (SWAMP), which contains a set of easy-to-use computing and networking toolkits for application scientists to conveniently assemble, execute, monitor, and control complex computing workflows in distributed network environments. The current version of SWAMP integrates the graphical user interface of Kepler to compose abstract workflows and employs Condor DAGMan for workflow dispatch and execution. SWAMP provides a web-based user interface to automate and manage workflow executions and uses a special workflow mapper to optimize the end-to-end workflow performance. A case study of the workflow for Spallation Neutron Source datasets in real networks is presented to show the efficacy of the proposed platform.